Learning Multi-Object Positional Relationships via Emergent Communication
The study of emergent communication has been dedicated to interactive
artificial intelligence. While existing work focuses on communication about
single objects or complex image scenes, we argue that communicating
relationships between multiple objects is important in more realistic tasks,
but understudied. In this paper, we try to fill this gap and focus on emergent
communication about positional relationships between two objects. We train
agents in the referential game where observations contain two objects, and find
that generalization is the major problem when the positional relationship is
involved. The key factor affecting the generalization ability of the emergent
language is the input variation between Speaker and Listener, which is realized
by a random image generator in our work. Further, we find that the learned
language can generalize well in a new multi-step MDP task where the positional
relationship describes the goal, and performs better than raw-pixel images as
well as pre-trained image features, verifying the strong generalization ability
of discrete sequences. We also show that language transfer from the referential
game performs better in the new task than learning language directly in this
task, implying the potential benefits of pre-training in referential games. All
in all, our experiments demonstrate the viability and merit of having agents
learn to communicate positional relationships between multiple objects through
emergent communication.
Comment: 15 pages
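The referential-game setup described in this abstract can be illustrated with a toy sketch. It is not the authors' method: the paper trains neural agents, whereas the `speaker` and `listener` below are hand-coded stand-ins, and the scene encoding, relation names, and message format are all assumptions made for illustration. The Listener's copy of the target uses different absolute positions from the Speaker's, which mimics the input variation between the two agents that the abstract highlights.

```python
# Toy referential game about the positional relationship between two objects.
# All names and encodings here are hypothetical; real agents would be learned.

def relation(pos_a, pos_b):
    """Dominant positional relation of object A relative to object B."""
    dx, dy = pos_a[0] - pos_b[0], pos_a[1] - pos_b[1]
    if abs(dx) >= abs(dy):
        return "left_of" if dx < 0 else "right_of"
    return "above" if dy < 0 else "below"  # image coordinates: y grows downward

def speaker(scene):
    """Stand-in for a learned Speaker: map a scene to a discrete message."""
    (shape_a, pos_a), (shape_b, pos_b) = scene
    return (shape_a, relation(pos_a, pos_b), shape_b)

def listener(message, candidates):
    """Stand-in for a learned Listener: return the index of the matching scene."""
    for index, scene in enumerate(candidates):
        if speaker(scene) == message:
            return index
    return 0  # arbitrary fallback guess when nothing matches

# The Speaker's view of the target scene.
speaker_target = (("circle", (1, 5)), ("square", (6, 5)))
# The Listener sees the same relationship at different absolute positions,
# plus a distractor in which the relationship is reversed.
listener_target = (("circle", (2, 3)), ("square", (8, 4)))
distractor = (("circle", (6, 5)), ("square", (1, 5)))

message = speaker(speaker_target)
choice = listener(message, [distractor, listener_target])
```

Because the message encodes only the shapes and their relationship, the Listener can identify the target even though no pixel-level position matches the Speaker's view.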
ImageManip: Image-based Robotic Manipulation with Affordance-guided Next View Selection
In the realm of future home-assistant robots, 3D articulated object
manipulation is essential for enabling robots to interact with their
environment. Many existing studies make use of 3D point clouds as the primary
input for manipulation policies. However, this approach encounters challenges
due to data sparsity and the significant cost associated with acquiring point
cloud data, which can limit its practicality. In contrast, RGB images offer
high-resolution observations using cost-effective devices but lack spatial 3D
geometric information. To overcome these limitations, we present a novel
image-based robotic manipulation framework. This framework is designed to
capture multiple perspectives of the target object and infer depth information
to complement its geometry. Initially, the system employs an eye-on-hand RGB
camera to capture an overall view of the target object. It predicts the initial
depth map and a coarse affordance map. The affordance map indicates actionable
areas on the object and serves as a constraint for selecting subsequent
viewpoints. Based on the global visual prior, we adaptively identify the
optimal next viewpoint for a detailed observation of the potential manipulation
success area. We leverage geometric consistency to fuse the views, resulting in
a refined depth map and a more precise affordance map for robot manipulation
decisions. By comparing with prior works that adopt point clouds or RGB images
as inputs, we demonstrate the effectiveness and practicality of our method.
Real-world experiments on the project webpage
(https://sites.google.com/view/imagemanip) further highlight the potential of
our method for practical deployment.
A comparative analysis of aerosol microphysical, optical and radiative properties during the Spring Festival holiday over Beijing and surrounding regions
Using ground-based data, meteorological observations, and atmospheric environmental monitoring data, a comparative analysis of the microphysical and optical properties and the radiative forcing of aerosols was conducted among three stations in differently developed environments during a severe air pollution episode over Beijing during the Spring Festival. During the most polluted period, the daily peak values of the aerosol optical depth were ~1.62, ~1.73, and ~0.74, which were about 2.6, 2.9, and 2.1 times higher than the background levels at the CAMS, Xianghe, and Shangdianzi sites, respectively. The daily peak values of the single scattering albedo were ~0.95, ~0.96, and ~0.87. The volume of fine-mode particles varied from 0.04 to 0.21 µm³ µm⁻², 0.06 to 0.17 µm³ µm⁻², and 0.01 to 0.10 µm³ µm⁻², which were about 0.3 to 5.8, 1.1 to 4.7, and 1.2 to 8.9 times greater than the background values, respectively. The daily absorption aerosol optical depth was ~0.01 to ~0.13 at CAMS, ~0.03 to ~0.14 at Xianghe, and ~0.01 to ~0.09 at Shangdianzi, and the absorption Ångström exponents reflected a significant increase in organic aerosols over CAMS and Xianghe and in black carbon over Shangdianzi. Aerosol radiative forcing at the bottom of the atmosphere varied from -20 to -130, -40 to -150, and -10 to -110 W m⁻² over the whole holiday period, indicating a cooling effect. The potential source contribution function and concentration-weighted trajectory analyses showed that Beijing, the southern parts of Hebei and Shanxi, and the central northern part of Shandong contributed greatly to the pollution.
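The quantities reported above are linked by a standard relation: at a given wavelength, the absorption aerosol optical depth equals the total aerosol optical depth multiplied by one minus the single scattering albedo. The sketch below shows only that relation. The function name and the input values are illustrative assumptions, not data from the study, whose reported peak values may come from different times or wavelengths and so need not satisfy it jointly.

```python
def absorption_aod(aod, ssa):
    """Absorption AOD from total AOD and single scattering albedo.

    Uses AAOD = AOD * (1 - SSA), with all quantities at the same wavelength.
    """
    if aod < 0.0:
        raise ValueError("aerosol optical depth must be non-negative")
    if not 0.0 <= ssa <= 1.0:
        raise ValueError("single scattering albedo must lie in [0, 1]")
    return aod * (1.0 - ssa)

# Illustrative inputs only (not taken from the paper): a fairly turbid
# atmosphere with moderately absorbing aerosol.
example_aaod = absorption_aod(aod=1.0, ssa=0.9)
```

A higher single scattering albedo means a larger share of the extinction is scattering, so the same total optical depth yields a smaller absorption optical depth.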
Cu(II)/Proline-Catalyzed Reductive Coupling of Sulfuryl Chloride and P(O)–H for P–S–C Bond Formation
A considerably improved method for the Cu-catalyzed coupling of sulfuryl
chloride with P(O)–H compounds is described. Using commercially available
L-proline as the ligand decreased the precatalyst loading, broadened the
substrate scope, and greatly improved the efficiency of the coupling reaction.
Moreover, the transformation features gram-scale preparation and an
easy-to-handle, recyclable catalyst.